13 research outputs found
Performance analysis of single slope solar still using sensible heat storage material
Direct sunlight has long been utilized for the distillation of water. Solar distillation plants are used to supply desalinated water to small communities in remote coastal areas. Solar stills are easy to construct (rural people can build them from locally available materials), simple enough to be operated by unskilled personnel, require little maintenance, and have almost no operating cost. To increase the efficiency of the solar still, we used sensible heat storage materials such as marbles, pebbles, blue metal stone, and basalt stone. With a sensible heat storage material, the distillation process continues both day and night.
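The principle behind sensible heat storage is that the material absorbs energy as Q = m·c·ΔT during the day and releases it to the basin water after sunset. A minimal sketch of this relation, with illustrative values that are assumptions rather than figures from the study:

```python
def sensible_heat_stored(mass_kg, specific_heat_j_per_kg_k, delta_t_k):
    """Energy stored as sensible heat: Q = m * c * dT (joules)."""
    return mass_kg * specific_heat_j_per_kg_k * delta_t_k

# Illustrative values (assumed, not from the study): 10 kg of pebbles with
# specific heat c ~ 880 J/(kg*K), heated 30 K above ambient during the day.
q = sensible_heat_stored(10, 880, 30)
print(q)  # 264000 J available for release to the basin water after sunset
```

The stored energy keeps the basin water warm overnight, which is why distillation continues after sunset.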
Invariant Slot Attention: Object Discovery with Slot-Centric Reference Frames
Automatically discovering composable abstractions from raw perceptual data is
a long-standing challenge in machine learning. Recent slot-based neural
networks that learn about objects in a self-supervised manner have made
exciting progress in this direction. However, they typically fall short at
adequately capturing spatial symmetries present in the visual world, which
leads to sample inefficiency, such as when entangling object appearance and
pose. In this paper, we present a simple yet highly effective method for
incorporating spatial symmetries via slot-centric reference frames. We
incorporate equivariance to per-object pose transformations into the attention
and generation mechanism of Slot Attention by translating, scaling, and
rotating position encodings. These changes result in little computational
overhead, are easy to implement, and can result in large gains in terms of data
efficiency and overall improvements to object discovery. We evaluate our method
on a wide range of synthetic object discovery benchmarks, namely CLEVR,
Tetrominoes, CLEVRTex, Objects Room and MultiShapeNet, and show promising
improvements on the challenging real-world Waymo Open dataset. Comment: Accepted at ICML 2023. Project page: https://invariantsa.github.io
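The core idea of a slot-centric reference frame is to translate and scale the absolute position grid by each slot's estimated position and scale before it enters the attention and generation steps. A minimal sketch of that coordinate transform (grid shape and values are hypothetical; the full method also learns the per-slot pose parameters):

```python
import numpy as np

def slot_relative_grid(abs_grid, slot_pos, slot_scale):
    """Map an absolute position grid into a slot-centric reference frame
    by translating by the slot's position and dividing by its scale.
    abs_grid: (H*W, 2) coordinates in [-1, 1]; slot_pos, slot_scale: (2,)."""
    return (abs_grid - slot_pos) / slot_scale

# Hypothetical 4x4 coordinate grid and one slot centred at (0.5, 0.5)
# with scale 0.25 along both axes.
ys, xs = np.meshgrid(np.linspace(-1, 1, 4), np.linspace(-1, 1, 4), indexing="ij")
grid = np.stack([ys.ravel(), xs.ravel()], axis=-1)          # (16, 2)
rel = slot_relative_grid(grid, np.array([0.5, 0.5]), np.array([0.25, 0.25]))
print(rel.shape)  # (16, 2)
```

Because each slot sees positions expressed relative to its own frame, the position encodings built from `rel` are invariant to where the object sits in the image, which is the source of the data-efficiency gains the abstract describes.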
RUST: Latent Neural Scene Representations from Unposed Imagery
Inferring the structure of 3D scenes from 2D observations is a fundamental
challenge in computer vision. Recently popularized approaches based on neural
scene representations have achieved tremendous impact and have been applied
across a variety of applications. One of the major remaining challenges in this
space is training a single model which can provide latent representations which
effectively generalize beyond a single scene. Scene Representation Transformer
(SRT) has shown promise in this direction, but scaling it to a larger set of
diverse scenes is challenging and necessitates accurately posed ground truth
data. To address this problem, we propose RUST (Really Unposed Scene
representation Transformer), a pose-free approach to novel view synthesis
trained on RGB images alone. Our main insight is that one can train a Pose
Encoder that peeks at the target image and learns a latent pose embedding which
is used by the decoder for view synthesis. We perform an empirical
investigation into the learned latent pose structure and show that it allows
meaningful test-time camera transformations and accurate explicit pose
readouts. Perhaps surprisingly, RUST achieves similar quality as methods which
have access to perfect camera pose, thereby unlocking the potential for
large-scale training of amortized neural scene representations.Comment: CVPR 2023 Highlight. Project website: https://rust-paper.github.io
Object-Centric Learning with Slot Attention
Learning object-centric representations of complex scenes is a promising step
towards enabling efficient abstract reasoning from low-level perceptual
features. Yet, most deep learning approaches learn distributed representations
that do not capture the compositional properties of natural scenes. In this
paper, we present the Slot Attention module, an architectural component that
interfaces with perceptual representations such as the output of a
convolutional neural network and produces a set of task-dependent abstract
representations which we call slots. These slots are exchangeable and can bind
to any object in the input by specializing through a competitive procedure over
multiple rounds of attention. We empirically demonstrate that Slot Attention
can extract object-centric representations that enable generalization to unseen
compositions when trained on unsupervised object discovery and supervised
property prediction tasks.
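The competitive procedure the abstract mentions is dot-product attention with the softmax taken over the slot axis, so slots compete for input features, followed by a weighted-mean update. A simplified NumPy sketch (the full module also applies learned projections and a GRU/MLP update, omitted here for brevity):

```python
import numpy as np

def slot_attention_step(slots, inputs, eps=1e-8):
    """One simplified Slot Attention round.
    slots: (K, D) slot vectors; inputs: (N, D) perceptual features."""
    d = slots.shape[-1]
    logits = inputs @ slots.T / np.sqrt(d)                  # (N, K)
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn = attn / attn.sum(axis=1, keepdims=True)           # softmax over slots: competition
    attn = attn / (attn.sum(axis=0, keepdims=True) + eps)   # normalize for weighted mean
    return attn.T @ inputs                                  # (K, D) updated slots

rng = np.random.default_rng(0)
slots = rng.normal(size=(3, 8))       # K = 3 exchangeable slots
feats = rng.normal(size=(16, 8))      # N = 16 input features
for _ in range(3):                    # multiple rounds of iterative refinement
    slots = slot_attention_step(slots, feats)
print(slots.shape)  # (3, 8)
```

Because the softmax normalizes across slots rather than across inputs, each input feature distributes its attention among the slots, which is what drives slots to specialize on distinct objects.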
Simple Open-Vocabulary Object Detection with Vision Transformers
Combining simple architectures with large-scale pre-training has led to
massive improvements in image classification. For object detection,
pre-training and scaling approaches are less well established, especially in
the long-tailed and open-vocabulary setting, where training data is relatively
scarce. In this paper, we propose a strong recipe for transferring image-text
models to open-vocabulary object detection. We use a standard Vision
Transformer architecture with minimal modifications, contrastive image-text
pre-training, and end-to-end detection fine-tuning. Our analysis of the scaling
properties of this setup shows that increasing image-level pre-training and
model size yield consistent improvements on the downstream detection task. We
provide the adaptation strategies and regularizations needed to attain very
strong performance on zero-shot text-conditioned and one-shot image-conditioned
object detection. Code and models are available on GitHub. Comment: ECCV 2022 camera-ready version
Self-supervised learning using motion and visualizing convolutional neural networks
We propose a novel method for learning convolutional image representations without manual supervision. We use motion, in the form of optical flow, to supervise representations of static images. Training a network to predict flow from a single image can be needlessly difficult due to intrinsic ambiguities in this prediction task. We instead propose two simpler learning goals: (a) embed pixels such that the similarity between their embeddings matches that between their optical-flow vectors (CPFS), or (b) segment the image such that optical-flow within segments constitutes coherent motion (S3-CNN). At test time, the learned deep network can be used without access to video or flow information and transferred to various computer vision tasks such as image classification, detection, and segmentation. Our CPFS model achieves state-of-the-art results in self-supervision using motion cues, as demonstrated on standard transfer learning benchmarks.
Despite high transfer learning performance, we feel the need to visualize the representation learned by our self-supervised CPFS model. With that motivation we develop a suite of visualization methods and study several landmark representations, both shallow and deep. These visualizations are based on the concept of “natural pre-image”, that is, a natural-looking image whose representation has some notable property. We study three such visualizations: inversion, in which the aim is to reconstruct an image from its representation; activation maximization, in which we search for patterns that maximally stimulate a representation component; and caricaturization, in which the visual patterns that a representation detects in an image are exaggerated. We formulate these into a regularized energy-minimization framework and demonstrate its effectiveness. We show that our method can invert HOG features more accurately than recent alternatives while being applicable to CNNs too. We apply these visualization techniques to our self-supervised CPFS model and contrast it with visualizations of a fully supervised AlexNet and a randomly initialized one.
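The CPFS objective described above, matching the similarity structure of pixel embeddings to that of their optical-flow vectors, can be sketched as a loss over pairwise cosine similarities. The function name and exact formulation here are illustrative, not the paper's:

```python
import numpy as np

def cpfs_loss(embeddings, flows):
    """Sketch of a cross-pixel flow-similarity objective: penalise mismatch
    between pairwise cosine similarities of pixel embeddings and those of
    their optical-flow vectors. embeddings: (N, D); flows: (N, 2)."""
    def cos_sim(x):
        x = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return x @ x.T                                  # (N, N) similarity matrix
    return np.mean((cos_sim(embeddings) - cos_sim(flows)) ** 2)

# When embeddings are aligned with the flow directions, the loss vanishes.
flows = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(round(cpfs_loss(flows * 2.0, flows), 6))  # 0.0
```

Note that nothing in this objective requires flow at test time: once trained, the embedding network runs on a single static image, which is why the representation transfers to classification, detection, and segmentation without video input.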